295 research outputs found

    Entropic Wasserstein Gradient Flows

    This article details a novel numerical scheme to approximate gradient flows for optimal transport (i.e. Wasserstein) metrics. These flows have proved useful for tackling, both theoretically and numerically, non-linear diffusion equations that model, for instance, porous media or crowd evolutions. These gradient flows define a suitable notion of weak solutions for these evolutions, and they can be approximated in a stable way using discrete flows. These discrete flows are implicit Euler time steppings with respect to the Wasserstein metric. A bottleneck of these approaches is the high computational load induced by the resolution of each step. Indeed, each step corresponds to solving a convex optimization problem involving a Wasserstein distance to the previous iterate. Following several recent works on the approximation of Wasserstein distances, we consider a discrete flow induced by an entropic regularization of the transportation coupling. This entropic regularization allows one to trade the initial Wasserstein fidelity term for a Kullback-Leibler divergence, which is easier to deal with numerically. We show how KL proximal schemes, and in particular Dykstra's algorithm, can be used to compute each step of the regularized flow. The resulting algorithm is fast, parallelizable and versatile, because it only requires multiplications by a Gibbs kernel. On Euclidean domains discretized on a uniform grid, this corresponds to a linear filtering (for instance a Gaussian filtering when the cost c is the squared Euclidean distance) which can be computed in nearly linear time. On more general domains, such as (possibly non-convex) shapes or manifolds discretized by a triangular mesh, following a recently proposed numerical scheme for optimal transport, this Gibbs kernel multiplication is approximated by a short-time heat diffusion.
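
The remark that a Gibbs kernel multiplication reduces to Gaussian filtering on a uniform grid can be checked on a small 1-D example. The sketch below is only an illustration under stated assumptions, not the article's implementation: the function names, the grid spacing `h` and the regularization strength `eps` are hypothetical, and the kernel is taken to be exp(-c/eps) with c the squared Euclidean distance. A direct convolution is used for clarity; an FFT-based or separable filter would be needed to approach the nearly linear cost mentioned in the abstract.

```python
import numpy as np

def gibbs_kernel_dense(n, h, eps):
    """Dense Gibbs kernel K[i, j] = exp(-|x_i - x_j|^2 / eps) on a uniform 1-D grid."""
    x = h * np.arange(n)
    C = (x[:, None] - x[None, :]) ** 2  # squared Euclidean cost (assumed)
    return np.exp(-C / eps)

def gibbs_apply_filter(v, h, eps):
    """Apply the same kernel as a Gaussian filter, without forming the n x n matrix."""
    n = v.shape[0]
    offsets = h * np.arange(-(n - 1), n)       # all possible grid offsets x_i - x_j
    g = np.exp(-offsets ** 2 / eps)            # sampled Gaussian, length 2n - 1
    full = np.convolve(v, g, mode="full")      # full[k] = sum_j v[j] * g[k - j]
    return full[n - 1: 2 * n - 1]              # entries aligned with the grid points
```

Both routines compute the same product K v; only the second avoids storing K, which is the property that keeps each step of the regularized flow cheap on grids.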

    A Smoothed Dual Approach for Variational Wasserstein Problems

    Variational problems that involve Wasserstein distances have been recently proposed to summarize and learn from probability measures. Despite being conceptually simple, such problems are computationally challenging because they involve minimizing over quantities (Wasserstein distances) that are themselves hard to compute. We show that the dual formulation of Wasserstein variational problems introduced recently by Carlier et al. (2014) can be regularized using an entropic smoothing, which leads to smooth, differentiable, convex optimization problems that are simpler to implement and numerically more stable. We illustrate the versatility of this approach by applying it to the computation of Wasserstein barycenters and gradient flows of spatial regularization functionals.

    Sparse Spikes Deconvolution on Thin Grids

    This article analyzes the recovery performance of two popular finite dimensional approximations of the sparse spikes deconvolution problem over Radon measures. We examine in a unified framework both the L1 regularization (often referred to as Lasso or Basis-Pursuit) and the Continuous Basis-Pursuit (C-BP) methods. The Lasso is the de facto standard for the sparse regularization of inverse problems in imaging. It performs a nearest neighbor interpolation of the spike locations on the sampling grid. The C-BP method, introduced by Ekanadham, Tranchina and Simoncelli, uses a linear interpolation of the locations to perform a better approximation of the infinite-dimensional optimization problem, for positive measures. We show that, in the small noise regime, both methods estimate twice as many spikes as there are original spikes. Indeed, we show that they both detect two neighboring spikes around the location of each original spike. These results for deconvolution problems are based on an abstract analysis of the so-called extended support of the solutions of L1-type problems (including as special cases the Lasso and C-BP for deconvolution), which is of independent interest. This analysis precisely characterizes the support of the solutions when the noise is small and the regularization parameter is selected accordingly. We use these findings to analyze for the first time the support instability of compressed sensing recovery when the number of measurements is below the critical limit (well documented in the literature) where the support is provably stable.
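
For readers unfamiliar with the Lasso mentioned above, the following is a minimal sketch of one standard way to solve it, iterative soft-thresholding (ISTA), for the objective 0.5 * ||A x - y||^2 + lam * ||x||_1. The solver choice, the function names and the parameters are assumptions made for illustration; the abstract does not say which algorithm the authors use, and the matrix `A` here stands in for a convolution operator sampled on the grid.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=1000):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth term's gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)         # gradient of the quadratic data-fidelity term
        x = soft_threshold(x - grad / L, lam / L)
    return x
```

The support of the returned vector (its nonzero entries) is the quantity whose stability the abstract studies; on a fine grid with a smooth convolution kernel one can inspect `np.flatnonzero(x)` to see which grid points are selected.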

    Compressive Wave Computation

    This paper considers large-scale simulations of wave propagation phenomena. We argue that it is possible to accurately compute a wavefield by decomposing it onto a largely incomplete set of eigenfunctions of the Helmholtz operator, chosen at random, and that this provides a natural way of parallelizing wave simulations for memory-intensive applications. This paper shows that L1-Helmholtz recovery makes sense for wave computation, and identifies a regime in which it is provably effective: the one-dimensional wave equation with coefficients of small bounded variation. Under suitable assumptions we show that the number of eigenfunctions needed to evolve a sparse wavefield defined on N points, accurately with very high probability, is bounded by C log(N) log(log(N)), where C is related to the desired accuracy and can be made to grow at a much slower rate than N when the solution is sparse. The PDE estimates that underlie this result are new to the authors' knowledge and may be of independent mathematical interest; they include an L1 estimate for the wave equation, an estimate of extension of eigenfunctions, and a bound for eigenvalue gaps in Sturm-Liouville problems. Numerical examples are presented in one spatial dimension and show that as few as 10 percent of all eigenfunctions can suffice for accurate results. Finally, we argue that the compressive viewpoint suggests a competitive parallel algorithm for an adjoint-state inversion method in reflection seismology. Comment: 45 pages, 4 figures

    Local Linear Convergence Analysis of Primal-Dual Splitting Methods

    In this paper, we study the local linear convergence properties of a versatile class of Primal-Dual splitting methods for minimizing composite non-smooth convex optimization problems. Under the assumption that the non-smooth components of the problem are partly smooth relative to smooth manifolds, we present a unified local convergence analysis framework for these methods. More precisely, in our framework we first show that (i) the sequences generated by Primal-Dual splitting methods identify a pair of primal and dual smooth manifolds in a finite number of iterations, and then (ii) enter a local linear convergence regime, which is characterized based on the structure of the underlying active smooth manifolds. We also show how our results for Primal-Dual splitting can be specialized to cover existing ones on Forward-Backward splitting and Douglas-Rachford splitting/ADMM (alternating direction method of multipliers). Moreover, based on these local convergence results, several practical acceleration techniques are discussed. To exemplify the usefulness of the obtained results, we consider several concrete numerical experiments arising from fields including signal/image processing, inverse problems and machine learning. The experiments not only verify the local linear convergence behaviour of Primal-Dual splitting methods, but also provide insights on how to accelerate them in practice.
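
As a concrete member of the class of Primal-Dual splitting methods discussed above, the sketch below applies a Chambolle-Pock style iteration to 1-D total variation denoising, min_x 0.5 * ||x - y||^2 + lam * ||D x||_1, where D is the finite-difference operator. The problem, the function names and the step sizes `tau` and `sigma` are assumptions chosen for illustration (they satisfy tau * sigma * ||D||^2 < 1 since ||D||^2 <= 4); the abstract does not single out this instance.

```python
import numpy as np

def D(x):
    """Forward finite differences: (D x)[j] = x[j + 1] - x[j]."""
    return np.diff(x)

def Dt(p):
    """Adjoint of D, so that <D x, p> = <x, Dt(p)>."""
    return -np.diff(p, prepend=0.0, append=0.0)

def tv_denoise_primal_dual(y, lam, n_iter=500, tau=0.45, sigma=0.45):
    """Primal-dual iteration for min_x 0.5 * ||x - y||^2 + lam * ||D x||_1."""
    y = np.asarray(y, dtype=float)
    x = y.copy()
    x_bar = x.copy()
    p = np.zeros(y.size - 1)                                  # dual variable, one per difference
    for _ in range(n_iter):
        p = np.clip(p + sigma * D(x_bar), -lam, lam)          # dual step: project onto [-lam, lam]
        x_new = (x - tau * Dt(p) + tau * y) / (1.0 + tau)     # primal step: prox of the quadratic
        x_bar = 2.0 * x_new - x                               # extrapolation
        x = x_new
    return x
```

In this example the primal "manifold" of the abstract corresponds to the set of positions where D x vanishes, and the dual one to the entries of `p` sitting on the bounds -lam or lam; tracking when these sets stop changing across iterations is one way to observe the finite identification phase empirically.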

    Learning Generative Models with Sinkhorn Divergences

    The ability to compare two degenerate probability distributions (i.e. two probability distributions supported on two distinct low-dimensional manifolds living in a much higher-dimensional space) is a crucial problem arising in the estimation of generative models for high-dimensional observations such as those arising in computer vision or natural language. It is known that optimal transport metrics can represent a cure for this problem, since they were specifically designed as an alternative to information divergences to handle such problematic scenarios. Unfortunately, training generative machines using OT raises formidable computational and statistical challenges, because of (i) the computational burden of evaluating OT losses, (ii) the instability and lack of smoothness of these losses, (iii) the difficulty of robustly estimating these losses and their gradients in high dimension. This paper presents the first tractable computational method to train large scale generative models using an optimal transport loss, and tackles these three issues by relying on two key ideas: (a) entropic smoothing, which turns the original OT loss into one that can be computed using Sinkhorn fixed point iterations; (b) algorithmic (automatic) differentiation of these iterations. These two approximations result in a robust and differentiable approximation of the OT loss with streamlined GPU execution. Entropic smoothing generates a family of losses interpolating between Wasserstein (OT) and Maximum Mean Discrepancy (MMD), thus making it possible to find a sweet spot leveraging the geometry of OT and the favorable high-dimensional sample complexity of MMD, which comes with unbiased gradient estimates. The resulting computational architecture nicely complements standard deep network generative models with a stack of extra layers implementing the loss function.
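
The Sinkhorn fixed point iterations referred to in (a) can be sketched in a few lines. This is a plain NumPy illustration under stated assumptions, not the paper's training pipeline: the function name, the fixed iteration count and the choice to report the transport cost of the regularized plan are hypothetical, and the automatic differentiation of the iterations described in (b) is not shown.

```python
import numpy as np

def sinkhorn(a, b, C, eps, n_iter=500):
    """Entropy-regularized OT between histograms a and b with cost matrix C.

    Returns the regularized transport plan P and its transport cost <P, C>.
    """
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)                  # rescale rows to match the marginal a
        v = b / (K.T @ u)                # rescale columns to match the marginal b
    P = u[:, None] * K * v[None, :]      # plan of the form diag(u) K diag(v)
    return P, float(np.sum(P * C))
```

Each iteration only involves matrix-vector products and elementwise divisions, which is what makes the loop easy to batch on a GPU and to differentiate through. For very small `eps` the entries of `K` underflow, and a log-domain variant would be preferable.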